(57) ABSTRACT

The inventive system and method is directed toward veri- 
fying the accuracy of data tables specified by a developer to 
be used by a program. The system searches through an 
application program for instructions which access areas of 
memory declared by the developer as being of interest and 
executes instrumentation code for these instructions. Input 
to the program is the source code of a user program and 
optionally, a data coverage specification prepared by a 
developer. Instrumentation can be implemented by inserting 
instrumenting code into the source code prior to compilation 
using facilities within the compiler itself. Alternatively, the 
instrumentation code can be added to the executable pro- 
gram code after compilation is complete. Yet a third option 
involves generating and executing instrumentation during 
execution of the user program without ever modifying the 
user program code at any stage. The output of the system is 
data coverage information indicating the number of times 
that various elements of the data tables of interest have been 
accessed during one full run of the user program. The system 
thereby provides a mechanism for evaluating the integrity of 
data to be accessed by a program where prior instrumenta- 
tion systems have concentrated on verifying the validity of 
program logic flow. 
SYSTEM AND METHOD FOR DATA COVERAGE ANALYSIS OF A COMPUTER PROGRAM

TECHNICAL FIELD OF THE INVENTION

This invention is directed to a method and system for
providing data coverage analysis of a computer program.

BACKGROUND OF THE INVENTION

Compilers convert a source program that is usually written
in a high level language into low level object code, which
typically consists of a sequence of machine instructions or
assembly language. The constructs in the source program are
converted into a sequence of assembly language instructions.
To obtain a reasonable level of confidence in the correctness
of the program, it is advantageous to test the
compiler-generated code on a wide variety of inputs so that
each and every block of code is exercised.

There are currently a number of tools available for doing
program flow analysis or path coverage analysis of an
application program. During program development, these
instrumenting tools allow a developer to see which control
paths are executed in his program. The tool instruments the
code, adding monitoring capabilities so that it can determine
which blocks of code get executed, and how often.

After running the application program along with the
instrumenting tool, there is usually some kind of
visualization facility which color codes sections of code in
the application program based upon the frequency of execution
of each code section. A product called PURE COVERAGE™
(available from Rational Software Corporation of Lexington,
Mass.) provides such a service. It has been observed that code
which is tested during the development phase is less likely to
contain bugs than code developed without such testing.

A related issue arose in a commonly available processor that
included a floating point divide bug where a large data table
driving the division algorithm contained some incorrect
values. The testing employed at the time failed to detect the
errors because there was no exhaustive test of all elements in
the data table.

This example exposes a problem in the art. Whereas the path
flow analysis may have been straightforward, well tested, and
have contained no errors, no comparable analysis was performed
on the data tables required for a floating point divide
operation. Consequently, an incorrect entry in this table went
undetected by whatever instrumenting techniques were applied
to that program.

This experience therefore demonstrates a shortcoming of the
path flow analysis technique. Even if the logic flow of an
application program is thoroughly analyzed, and found to be
correct by an instrumenting program, there remains the
possibility that the program could malfunction upon execution
because there has not been a comprehensive evaluation of the
correctness of entries in the data table employed by the
application program.

Therefore, there is a need in the art for a method and system
for identifying, after execution of a program which accesses
data tables, the number of times each element of each data
table was accessed.

There is a further need in the art for a method and system
for identifying elements in data tables which have not been
accessed at all.

SUMMARY OF THE INVENTION

These and other objects, features and technical advantages
are achieved by a system and method which generates a data
coverage specification which identifies functions of interest
and memory locations within a range of interest, and then
instruments program statements which satisfy one or [...] of
the data coverage specification.

In order to determine the number of entries in a data table
accessed during operation of an application program, the areas
in computer memory associated with this data table must be
identified. The developer lists functions of interest and
memory locations associated with data tables which are of
interest to the developer. The resulting package of
information is the data coverage specification. The
application program and data coverage specification are
provided to the data coverage instrumentation tool which
actually searches through the program looking for program
instructions which access memory locations of interest.

The existence of the data coverage specification permits the
instrumentation tool to concentrate on instructions which
access selected areas of memory, rather than instrumenting all
memory access instructions, thus reducing the workload of the
instrumentation program. Alternatively, all code which
accesses constant data could be instrumented thereby providing
greater simplicity to the instrumenting algorithm, but also
incurring the additional processing time of instrumenting a
greater total number of instructions.

In a preferred embodiment, the data coverage specification
identifies both functions of interest which can be mapped to
code regions of interest as well as data tables to be checked
which are located in memory areas of interest. Mapping of
function names to code areas of interest requires mapping
information connecting the function names to the code areas in
memory. Such mapping information is commonly found in
executable image files. In an alternative embodiment, if such
function to memory location mapping is unavailable, the
functions to be instrumented could be identified by explicitly
stating the addresses where the functions are found. This
latter approach is, however, more inconvenient for the
developer.

In a preferred embodiment, the mechanism executes a two phase
process for keeping track of access or reads from data tables
of interest. The first phase involves instrumenting only
instructions associated with functions of interest, such
instructions being instructions of interest. The mechanism
then acts to determine whether the instruction of interest
accesses or may access a memory region of interest. If the
instruction either does not access memory, or accesses memory
which is definitely outside the memory region of interest, no
further action is taken. If the instruction of interest either
may read from or definitely reads from a memory location of
interest, the second phase of the memory access tracking is
activated which is preferably the insertion of dynamic tracing
code.

Preferably, the dynamic tracing code determines whether the
instruction of interest identified in phase one as possibly
accessing the memory region of interest in fact accesses this
region. If the dynamically traced instruction is ultimately
found not to access the memory region of interest, no further
action is taken with regard to that instruction. If the
dynamically traced instruction does in fact access a memory
region of interest, the counter for the data element in the
region of interest accessed by the instruction is
appropriately incremented. The code added by the
instrumentation tool will execute along with the application
program in which it is embedded, and create an auxiliary table
containing coverage [...]

By way of example, if the memory region of interest is a
table of 100 data items, then a data coverage table would be
created which also contained 100 elements, with each element
having a counter initialized to "0" and which corresponds to a
data element in the memory region of interest for the purpose
of keeping count of the number of times that data element gets
accessed during execution. For the purpose of this example,
let us assume that the instruction "ADD" is associated with a
function of interest. Encountering an "ADD" instruction will
trigger further examination of the instruction. Now that the
current instruction is known to be an instruction of interest,
it remains to determine whether the instruction accesses a
memory region of interest. If the instruction does not access
a memory region of interest, no further action is taken.

If the instruction does access a memory region of interest,
the data element in the region which has been accessed is
identified along with its counterpart in the data coverage
table. The appropriate element in the data coverage table is
then incremented to reflect the read operation performed by
the instruction of interest. This and other counters will be
similarly incremented as subsequent instructions of interest
are found to read from memory regions of interest.
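
The counter update described above can be sketched as follows. This is only an illustration in Python, not the mechanism as implemented: the class name `CoverageTable`, the base address `0x1000`, and the 4-byte element size are hypothetical, and the counter index is assumed to be the byte offset from the start of the table divided by the element size.

```python
from dataclasses import dataclass, field


@dataclass
class CoverageTable:
    """Counters shadowing one data table (a memory region of interest)."""
    base: int        # hypothetical start address of the data table
    elem_size: int   # size in bytes of one table element (assumed fixed)
    length: int      # number of elements in the data table
    counts: list = field(init=False)

    def __post_init__(self):
        # One counter per data element, each initialized to zero.
        self.counts = [0] * self.length

    def record_read(self, address: int) -> bool:
        """Count a read if `address` lies inside the table.

        Returns True when a counter was incremented, False when the
        address is outside the region and no further action is taken.
        """
        offset = address - self.base
        if offset < 0 or offset >= self.length * self.elem_size:
            return False
        self.counts[offset // self.elem_size] += 1
        return True


# A table of 100 data items gets a coverage table of 100 counters.
table = CoverageTable(base=0x1000, elem_size=4, length=100)
table.record_read(0x1000)   # reads element 0
table.record_read(0x1008)   # reads element 2
table.record_read(0x1008)   # reads element 2 again
table.record_read(0x2000)   # outside the table, ignored
```

Keeping the coverage table the same length as the data table means each counter can be located with a single subtraction and division.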
Upon completion of execution, each counter would have a value
equal to the total number of times that memory location was
accessed. Any counter having a value of 0 after program
execution would trigger attention from the developer, since
the memory location associated with that counter has not yet
been tested. Counter value data is then dumped out to a
coverage file after execution of the instrumented program.
There is a facility to merge the data coverage files resulting
from different runs of the instrumented program.
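
Dumping counters to a coverage file and merging the files from several runs might look like the sketch below. The JSON file format, the file names, and the function names `dump_coverage`, `merge_coverage`, and `unaccessed` are assumptions made for illustration; the specification does not state a file format.

```python
import json
import os
import tempfile


def dump_coverage(counts, path):
    """Write one run's counter values to a coverage file (JSON assumed)."""
    with open(path, "w") as f:
        json.dump(counts, f)


def merge_coverage(paths):
    """Sum the counters from several coverage files element by element."""
    merged = None
    for path in paths:
        with open(path) as f:
            counts = json.load(f)
        if merged is None:
            merged = list(counts)
        elif len(counts) != len(merged):
            raise ValueError("coverage files describe tables of different sizes")
        else:
            merged = [a + b for a, b in zip(merged, counts)]
    return merged or []


def unaccessed(counts):
    """Indices of data elements whose counter is still zero."""
    return [i for i, c in enumerate(counts) if c == 0]


with tempfile.TemporaryDirectory() as tmp:
    run1 = os.path.join(tmp, "run1.json")
    run2 = os.path.join(tmp, "run2.json")
    dump_coverage([2, 0, 0, 1], run1)   # counters from a first test run
    dump_coverage([0, 3, 0, 0], run2)   # counters from a second test run
    merged = merge_coverage([run1, run2])
    untested = unaccessed(merged)
```

An element counts as untested only if its counter is zero in every run, which is why the zero check is applied to the merged counters rather than to each file separately.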

Next, this coverage information is read from the merged file
using a visualization tool which displays the number of times
each element in the data table has been accessed. The
visualization tool acts to more clearly illustrate the number
of times each element in the table has been accessed. One
approach to visualization would be to represent different
ranges of access in different colors. In a preferred
embodiment, Black would be used to indicate a high access
level, Pink to indicate a low access level, and Red to indicate
unaccessed items.
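
The color coding can be illustrated with a small sketch. The function name `access_color` and the cutoff of 10 accesses between "low" and "high" are hypothetical; the specification names the colors but gives no numeric ranges.

```python
def access_color(count, high_threshold=10):
    """Map an access count to a display color.

    `high_threshold` is an assumed cutoff between the low and high
    access levels; it is not taken from the specification.
    """
    if count == 0:
        return "Red"      # unaccessed item
    if count >= high_threshold:
        return "Black"    # high access level
    return "Pink"         # low access level


colors = [access_color(c) for c in [0, 3, 25]]
```

Here `colors` evaluates to `["Red", "Pink", "Black"]`.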

The above approach will identify for the developer, 
elements in the data table which have not been accessed by 
the application program in the course of running the test 
suite. With this information, the developer may either 
modify the test suite to ensure that all elements in the table 
are accessed, or examine the unaccessed elements by hand 
to ensure that they are correct. 

Therefore, it is a technical advantage of the present 
invention that the number of accesses to each element in data 
tables of interest during execution of a program is identified. 

It is a further technical advantage of the present invention 
that elements in data tables of interest which have not been 
accessed at all are identified. 

The foregoing has outlined rather broadly the features and 
technical advantages of the present invention in order that 
the detailed description of the invention that follows may be 
better understood. Additional features and advantages of the 
invention will be described hereinafter which form the 
subject of the claims of the invention. It should be appre- 
ciated by those skilled in the art that the conception and 
specific embodiment disclosed may be readily utilized as a 
basis for modifying or designing other structures for carry- 
ing out the same purposes of the present invention. It should 
also be realized by those skilled in the art that such equiva- 
lent constructions do not depart from the spirit and scope of 
the invention as set forth in the appended claims. 



BRIEF DESCRIPTION OF THE DRAWINGS 

For a more complete understanding of the present 
invention, and the advantages thereof, reference is now 
made to the following descriptions taken in conjunction with 
the accompanying drawings, in which: 

FIG. 1A depicts an instrumenting compiler implementa- 
tion of the inventive system according to a preferred 
embodiment of the present invention; 

FIG. 1B depicts a static instrumentor implementation of
the inventive system according to a preferred embodiment of 
the present invention; 

FIG. 1C depicts a dynamic instrumentor implementation 
of the inventive system according to a preferred embodiment 
of the present invention; 

FIG. 2 depicts a data coverage specification according to 
a preferred embodiment of the present invention; 

FIG. 3A depicts the process of inserting instrumentation 
code into an application program using the first two inven- 
tive mechanisms of the present invention; 

FIG. 3B depicts the process of inserting instrumentation
code into an application program using the third inventive 
mechanism of the present invention; 

FIG. 3C depicts the emulation of program instructions
using instrumentation according to the third inventive 
mechanism of the present invention; 

FIG. 4 depicts the program instrumentation procedure for 
both the instrumenting compiler and static instrumentor 
according to a preferred embodiment of the present inven- 
tion; 
FIG. 5 depicts the execution of dynamic tracing code for
both the instrumenting compiler and static instrumentor 
according to a preferred embodiment of the present inven- 
tion; 

FIG. 6 depicts the instrumentation procedure for the 
dynamic instrumentor; and 

FIG. 7 depicts a computer system adapted to use the 
present invention.

DESCRIPTION OF THE INVENTION 

FIG. 1A depicts an implementation of the inventive 
instrumentor in an instrumenting compiler 100 according to 
a first embodiment of the present invention. In this first
embodiment, the program source code 101, and the data 
coverage specification 102 are provided to the enhanced 
compiler having an integrated data coverage analysis component
(DCA) or "instrumented compiler" 104. It is noted
here that the invention can be practiced without the data 
coverage specification 102. The specification 102 is 
included however because it makes the instrumentation 
process faster and more efficient. Without the data coverage 
specification 102, all instructions would have to be 
instrumented, making the instrumented program run much 
more slowly. The process of screening data access instructions
for instrumentation using a data coverage specification
is discussed in greater detail in connection with FIG. 2. 

With the instrumenting compiler 100, instrumentation is 
performed within the compiler itself rather than being per- 
formed after compilation, or during execution by a dynamic 
instrumentation tool. 

In this embodiment, the inventive instrumentor function 
103 is integrated into the enhanced compiler 104 in the form 
of the data coverage analysis component. The compiler 104 
then operates on the source code 101 in combination with 
the data coverage specification 102, and produces the
instrumented executable image file 105. The compiler 104 then
preferably generates executable code which performs the
instrumentation operations as well as the operations of the
original application program.

[...] of the instrumented compiler embodiment [...] generate
machine [...] for the functions in the source code as well as
generate code for checking reads from regions of interest in
memory. When appropriate, the instrumented compiler 104
creates a counter to maintain track of access to data elements
within a particular region of memory. The executable code
associated with program functions and with the instrumentation
operations becomes part of the executable image 105. The
instrumentation operations are discussed in greater detail in
the discussions of FIGS. 4 and 5.

Execution of the instrumented executable image 105 causes the
standard application program functions contained in the
program source code 101 to execute, and also generates
coverage information 106 indicating how many times the various
elements of data tables specified in the data coverage
specification 102 were accessed during execution. The coverage
information 106 is then fed into the visualization tool 107
which correlates the information with the program source code
101, and displays the coverage information 106 in a visually
illustrative manner. The visual depiction of the coverage
information is discussed in greater detail after the
discussion of FIG. 5.

FIG. 1B depicts a Static Instrumentor implementation of the
inventive mechanism according to a second embodiment of the
present invention. In this second of the three embodiments,
the program source code 101 is fed into a traditional compiler
with debug facilities 108. The debug facilities present in the
compiler 108 help guide the instrumentation process by
providing information about the size of the data tables that
are of interest as well as the size of [...] limited to having
symbol table information which would indicate where a [...]
element in the table [...].

The compiler 108 generates an executable image with the
incorporated debugging information 109. Then, the executable
image 109 and the data coverage specification 102 are fed into
the static instrumentation tool 110. In this embodiment, the
static instrumentation tool 110 contains the inventive
instrumenter 103.

The static instrumentation tool 110 contains code which
searches through executable code of executable image 109
instrumenting instructions which are of interest according to
[...], thereby producing the [...].

From this point forward, this mechanism operates exactly as
described for the first mechanism above, producing coverage
information 106 to be correlated with the program source code
101 and then fed into the visualization tool 107 which
displays the coverage information 106 in a visually
illustrative manner.

The second mechanism for instrumenting the application
program, depicted in FIG. 1B, is preferred over the first when
it is desired to instrument code generated by a number of
different compilers, possibly from different compiler vendors.
The second mechanism may also be preferable over the first if
commercial interests are enhanced by selling the compiler and
instrumentation tool as separate products rather than as a
unified product.

FIG. 1C depicts the Dynamic Instrumentation implementation of
the inventive mechanism according to a third embodiment of the
present invention, which is the preferred embodiment. As in
the second mechanism (the Static Instrumenter), the program
source code 101 is fed into the traditional [...] facilities
provide useful information about the data tables of interest.
The compiler 108 produces an executable image with [...]
dynamic instrumentation tool 111 which contains the inventive
instrumentor 103.

In contrast to the Static Instrumentation Tool 110, the
Dynamic Instrumentation Tool 111 does not generate an
instrumented executable file. Instead, the dynamic
instrumentation tool 111 executes the functions of the
executable image, and simultaneously looks for memory accesses
to regions of interest as indicated by the data coverage
specification 102. Where memory accesses to regions of
interest are found in the executable image 109, the dynamic
instrumentation tool 111, in addition to executing the
original program instructions, executes instrumenting
instructions which are incorporated into the dynamic
instrumentation tool itself. The dynamic instrumentation tool
111 does not add code to the executable image.

Execution of the instrumenting instructions by the dynamic
instrumentation tool 111 generates coverage information 106
indicating the number of times each element within a data
table identified as being of interest in the data coverage
specification 102 has been accessed during one full run of the
executable image 109. Once execution of the executable image
109 is finished, the coverage information 106 is complete. The
coverage information 106 is then fed to the visualization tool
107 which produces a visually demonstrative display of the
coverage information 106.

FIG. 2 depicts a data coverage specification according to a
preferred embodiment of the present invention. The "trace
<TABLE>" directive 201 acts to command that function names
contained within "FUNCTION-LIST" be instrumented so as to
count the number of times elements within the data table
"TABLE" are accessed by program instructions associated with
functions contained within function list 203 which is here
simply called "FUNCTION-LIST". The function list 203 is then
defined, listing the names of the functions which will be
instrumented for memory access tracking during execution. For
the generic example provided in element 201, only code
associated with the functions contained within "FUNCTION-LIST"
will be instrumented for the purpose of tracking memory
access.

A more specific example is provided where the "trace"
directive 204 [...] two application functions, "emit_simple"
and "emit_imm8". For this directive 204, the completed
coverage information will indicate the number of times each
element of the data table or array "OPCODE" is accessed when
the application is executed. It is noted that for the example
given in 204, the mechanism will only instrument, and
therefore only track memory access for, code associated with
the functions emit_simple and emit_imm8, thereby considerably
limiting the computation time needed to accomplish the
instrumentation. This result will provide information
available to a developer indicating whether all of the machine
instructions whose opcodes are encoded in the table called
"OPCODE" have been read from the opcode table during the
execution of the application.

FIG. 2 illustrates one possible embodiment of a data coverage
specification for screening instructions for memory access
instrumentation. The object of such screen-
ing is to reduce the required computational burden of the
instrumentation while still providing valuable memory access
information. As an alternative to identifying specific
functions to be instrumented for memory access, the program
instructions could be identified by leaving codes in the
program, identifiable to the compiler, which would flag
selected instructions for instrumentation.

In an alternative embodiment, the mechanism would identify
data tables of interest having certain ranges of memory
locations, and count memory accesses to the selected ranges of
memory locations by instrumenting all program instructions
containing accesses to memory locations within the range of
the identified data tables (memory regions of interest). This
approach employs a memory location centered approach, and is
more computationally expensive than the data coverage
specification embodiment illustrated in FIG. 2 since the
mechanism would have to check memory access activity by all
instructions capable of accessing a memory region of interest
instead of checking such access only for a restricted set of
instructions limited by criteria in the data coverage
specification.

It is noted that the data coverage specification could be
considerably more complex, and more or less restrictive, than
the embodiments discussed above, or it could be omitted
entirely. The data coverage specification is included to
reduce the required amount of instrumentation. In the absence
of a data coverage specification, the mechanism would check
every line of code in a program for memory access, and keep
count of all the accesses to the memory locations encountered
during execution of the application program. The programmer
could subsequently limit scrutiny of the data coverage
information output by the program to the memory regions of
interest.

FIG. 3A depicts the process of inserting instrumentation code
into an application program using the embodiments depicted in
FIGS. 1A and 1B, and previously discussed in this
specification in connection with those FIGS.

At step 301, the instrumentor takes as input the program code
101 and data coverage specification 102, and parses the
coverage request. Operating on the example of FIG. 2, the
parser would identify two code regions of interest, the
functions "emit_simple" and "emit_imm8", and one memory region
of interest (or data region of interest), the table labeled
"opcode".

At step 302, the instrumentor 300 builds coverage tables to be
included in the instrumentation code. By way of example, if
the data table named "Opcode" had 100 elements, the
instrumentor would build a coverage table, or data coverage
table, corresponding to the Opcode table, also containing 100
elements whose values are initialized to zero.

At step 400, instrumentation code is generated. The
instrumentor searches for code in regions associated with
functions "emit_simple" and "emit_imm8", and adds
instrumentation to lines of code in these regions which access
memory. The process of instrumenting 400 is described in
detail in FIG. 4.

At step 303, instrumentation code is added to the original
[...] information relating to data coverage.

At step 304, termination code is added to the program. The
termination code writes the coverage information to a file for
later examination.

FIG. 3B depicts the process of inserting instrumentation code
into an application program employing a dynamic instrumentor
as depicted in FIG. 1C and discussed in connection therewith.

At step 313, the dynamic instrumentor takes as input the
program code 101 and data coverage specification 102, and
parses the coverage request. Operating on the example of FIG.
2, the parser would identify two code regions of interest, the
functions "emit_simple" and "emit_imm8", and one memory region
of interest (or data region of interest), the table labeled
"opcode".

At step 314, the dynamic instrumentor builds coverage tables
to be included in the instrumentation code. By way of example,
if the data table named "Opcode" had 100 elements, the
instrumentor would build a coverage table or data coverage
table, corresponding to the Opcode table, also containing 100
elements whose values are initialized to zero. The individual
elements of the data coverage table will be used to count the
number of memory accesses to the corresponding elements of the
data table to which access is being tracked.

At step 320, the dynamic instrumentor emulates the original
program instructions along with instrumentation code. The
process of emulation 320 is described in [...] detail in FIG.
3C. Finally, at step 315, the dynamic instrumentor emits
coverage information, thereby filling out the data coverage
tables.

FIG. 3C depicts the emulation of program instructions using
instrumentation according to the third embodiment (dynamic
instrumentation) of the present invention 320. At step 321,
the inventive mechanism retrieves the first program
instruction. At step 322, the mechanism emulates the
instruction.

[...] instruments the instruction according to the procedure
shown in detail in FIG. 6. At decision block 323, the
mechanism determines [...] the mechanism gets the next
instruction in the [...] of execution of the program in step
324.

[...] a data coverage [...] in FIGS. 1A and 1B. FIG. 6
describes the instrumentation process for the embodiment
depicted in FIG. 1C, that of dynamic instrumentation. The data
coverage [...] read operations [...] for maintaining track of
read operations from specific data elements within the memory
regions of interest.

The data access recorder can include a data access [...] for
instrumenting program instructions which access [...] data
coverage [...] for ensuring that only instructions associated
with said functions of interest specified or listed in the
data coverage specification are instrumented to check for
access to memory locations of interest. Instrumenting such a
restricted set of instructions reduces the total computational
burden placed upon the instrumenting program. The function
list specification is a subset of the data coverage
specification and identifies the instructions associated with
functions of interest.

The inventive system comprises a coverage instruction locator
which identifies instructions which are both associated with
functions specified in the data coverage





US 6,430,741 B1


specification, and access data in the memory region of
interest. A data coverage reporter is available for counting
the number of times each data element in the memory region
of interest is accessed by the instrumented instructions.
The data coverage reporters may include data coverage
tables having elements which correspond to data elements in
the memory regions of interest, wherein each element in the
data coverage table serves as a counter. The data coverage
reporter may further include a data element resetter for
initializing the counters to zero after execution of a program
has concluded. Data element adjustors are available for
incrementing the counters when their associated elements
are accessed by an instrumented program instruction.

FIG. 4 depicts the program instrumentation procedure 400
for both the instrumenting compiler and static instrumentor
according to a preferred embodiment of the present
invention. This procedure analyzes each instruction of the
program and decides where to insert instrumentation code.

At step 401, the procedure 400 retrieves a program
instruction from the user program. At step 402, the
procedure 400 determines whether the instruction is in a traced
function as specified in the data coverage specification 102.
If the instruction is not in a traced function, execution
continues at step 406 which retrieves the next user program
instruction.

If the instruction is in a traced function, execution
continues at step 403 which determines whether the program
instruction 401 accesses memory. If there is no memory
access in the instruction, the procedure 400 retrieves the next
instruction in step 406. If the program instruction 401 does
access memory, execution continues at step 404.

At step 404, the procedure 400 determines whether the
memory location accessed by program instruction 401 is
potentially traced according to the data coverage
specification 102. Here, "potentially traced" means potentially read
from a memory region of interest. If the memory location is
not potentially traced, the procedure 400 retrieves the next
instruction at step 406. If the memory location is potentially
traced, execution continues at step 405. A memory location
is potentially traced if it cannot be ascertained with certainty
that the read from memory detected in step 403 is outside the
memory region of interest according to the data coverage
specification. Within the data coverage specification, a
memory region specification indicates which areas of
memory are of interest and to which access will be
monitored.

At step 405, the procedure 400 inserts dynamic tracing
code. At this stage, the dynamic tracing code is inserted but
not executed because information necessary for executing
the tracing code will not be available until run-time. The
dynamic tracing code is described in greater detail in
connection with FIG. 5.

At step 406, the next instruction in program execution
order is acquired.

FIG. 5 depicts the execution of dynamic tracing code 500
for both the instrumenting compiler and static instrumentor.
Instruction 501 has already been screened for various
criteria in the program instrumentation procedure 400. FIG. 5
depicts the run-time execution of the tracing code inserted in
step 405.

At step 502, the instrumentation code reads the memory
address accessed by the instruction 501. At step 503, the
instrumentation code determines whether the memory
address read in step 502 is in the traced region (memory
region of interest) or not. If the address is outside the traced
region, the instrumentation code for the current instruction
terminates, and execution proceeds to the next instruction in
program execution order.

If the address is in the traced region, step 504 determines
the offset of this memory location from the base of the data
table in memory as defined by the data coverage
specification 102. The offset number determined in step 504 divided
by the size of each data item determines the index of the
counter (created in step 302) to be incremented in step 505.
At step 505, the instrumentation code increments the counter
identified in step 504. Next, at step 506, execution proceeds
to the next instruction in program execution order.

Upon completion of execution, each counter should have
a value equal to the total number of times its corresponding
data element was accessed, or read from, during program
execution. Any counter having a value of 0, thereby
indicating a null access condition, after program execution
would trigger attention from the developer since the memory
location associated with that counter has not yet been tested.
The null access tracker would act to report data elements in
the memory region of interest which have not been accessed
at all. Counter value data is then dumped out to a coverage
file after execution of the instrumented program. There is a
facility for merging the data coverage files resulting from
different runs of the instrumented program.

The following discussion applies to the embodiments
depicted in FIGS. 1A, 1B, and 1C. The coverage information
106 is read from the merged file using a visualization tool
107 which displays the number of times each element in the
data table has been accessed. The visualization tool acts to
more clearly illustrate the number of times each element in
the table has been accessed. One approach to visualization
would be to represent different ranges of access in different
colors. In a preferred embodiment, Black would be used to
indicate a high access level, Pink to indicate a low access
level, and Red to indicate unaccessed items.

Alternatively, a wide range of different colors could be
used to indicate the various access levels. Although three
levels of data access are discussed in connection with the
preferred embodiment, any number of data access levels
could be employed without departing from the scope of the
present invention.

The above approach will identify for the developer
elements in the data table which have not been accessed by
the application program in the course of running the test
suite. With this information, the developer may either
modify the test suite to ensure that all elements in the table
are accessed, or examine the unaccessed elements by hand
to ensure that they are correct.

FIG. 6 depicts the instrumentation procedure 600 for the
dynamic instrumentor. At step 601 the next instruction in the
execution order is retrieved for instrumentation.

In the cases of the instrumenting compiler embodiment
which is discussed in connection with FIG. 1A, and the static
instrumentor embodiment discussed in connection with FIG.
1B, the instrumentation process is conducted in two separate
phases: a static portion performed prior to execution
depicted in FIG. 4, discussed in connection therewith and a
run-time portion, depicted in FIG. 5, and discussed in
connection therewith.

These two separate phases exist for the static approaches
(the instrumenting compiler and the static instrumenter)
because a first instrumentation step is performed prior to
execution of the code at which time certain information is
not yet available, such as the precise element of a table to be
accessed by an instruction. Later, during execution of the
code, with all the required information available, the
dynamic tracing code insertion procedure (depicted in FIG.
5) is executed. Thus, FIG. 4 depicts the pre-execution, static 
portion of the instrumenter, and FIG. 5 depicts the run-time, 
dynamic portion of the instrumenter. 
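
By way of illustration only, the pre-execution screening of FIG. 4 (steps 402 through 405) might be sketched as follows. The sketch is not the disclosed implementation: the names Instruction, CoverageSpec, may_touch_region and select_for_instrumentation, and the representation of an instruction as a small record with an optional statically known address, are assumptions introduced for the example. An instruction whose address cannot be determined before run-time is treated as potentially traced, since it cannot be shown to fall outside the memory region of interest.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Instruction:
    """Hypothetical record for one program instruction."""
    function: str                         # name of the enclosing function
    reads_memory: bool                    # True if the instruction reads memory
    static_address: Optional[int] = None  # address, if known before run-time


@dataclass
class CoverageSpec:
    """Hypothetical stand-in for the data coverage specification."""
    traced_functions: Set[str]
    region_base: int
    region_size: int

    def may_touch_region(self, address: Optional[int]) -> bool:
        # An unknown address cannot be proven to lie outside the region,
        # so the access is treated as potentially traced.
        if address is None:
            return True
        return self.region_base <= address < self.region_base + self.region_size


def select_for_instrumentation(program, spec):
    """Return the indices of instructions that would receive tracing code."""
    selected = []
    for index, instr in enumerate(program):
        if instr.function not in spec.traced_functions:      # step 402
            continue
        if not instr.reads_memory:                           # step 403
            continue
        if not spec.may_touch_region(instr.static_address):  # step 404
            continue
        selected.append(index)                               # step 405
    return selected


spec = CoverageSpec({"lookup"}, region_base=0x1000, region_size=64)
program = [
    Instruction("main", True, 0x1000),    # not in a traced function
    Instruction("lookup", False),         # no memory read
    Instruction("lookup", True, 0x2000),  # provably outside the region
    Instruction("lookup", True, None),    # address unknown until run-time
    Instruction("lookup", True, 0x1010),  # provably inside the region
]
print(select_for_instrumentation(program, spec))  # [3, 4]
```

Only the last two instructions are selected. The run-time check of FIG. 5 would then decide, for each selected instruction, whether a given access actually falls within the region.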

For the dynamic instrumentor depicted in FIG. 1C and 
discussed in connection therewith, the entire instrumentation 
procedure occurs at run-time thereby merging the function- 
ality of FIGS. 4 and 5. Some operations performed for the 
static instrumentation approaches in FIGS. 4 and 5 end up 
being condensed in the instrumentation procedure employed 
for the dynamic instrumenter embodiment. The resulting 
dynamic instrumentation procedure is depicted in FIG. 6, 
and discussed in the following. 

At decision block 602, the instrumentation code deter- 
mines whether the instruction is in a traced function. If the 
instruction is not in a traced function, execution resumes at 
step 607. If the instruction is in a traced function, execution 
proceeds at decision block 603. 

At decision block 603, the instrumentation code deter- 
mines whether the instruction conducts a read from memory. 
If there is no read from memory, execution resumes at step 
607. If there is a read from memory, execution proceeds with 
decision block 604. 

At decision block 604, the instrumentation code deter- 
mines whether the read address is in a traced region or not. 
The traced region corresponds to the memory region of 
interest defined in the data coverage specification 102. If the 
read address is not in a traced region, execution resumes at 
step 607. If the read address is in a traced region, execution 
proceeds at step 605. 

If the address is in the traced region, step 605 determines 
the offset of this memory location from the base of the data 
table in memory as defined by the data coverage specifica- 
tion 102. The offset number determined in step 605 divided 
by the size of each data item determines the index of the 
counter (created in step 314) to be incremented in step 606. 
At step 606, the instrumentation code increments the counter 
identified in step 605. Next, at step 607, execution proceeds 
to the next instruction in program execution order.
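
The run-time decisions of blocks 602 through 606 can be sketched in a few lines, under stated assumptions. The class DataCoverage, its on_instruction hook and the unaccessed helper are hypothetical names chosen for the example. A real dynamic instrumentor would obtain the function and read address from the executing instruction rather than from explicit arguments.

```python
class DataCoverage:
    """Illustrative counters for one data table (memory region of interest)."""

    def __init__(self, base, item_size, item_count, traced_functions):
        self.base = base                    # base address of the data table
        self.item_size = item_size          # size of each data item in bytes
        self.traced_functions = set(traced_functions)
        self.counters = [0] * item_count    # one counter per data element

    def on_instruction(self, function, read_address):
        """Hypothetical hook; read_address is None when nothing is read."""
        if function not in self.traced_functions:   # block 602
            return
        if read_address is None:                    # block 603
            return
        offset = read_address - self.base
        region_size = self.item_size * len(self.counters)
        if not 0 <= offset < region_size:           # block 604
            return
        index = offset // self.item_size            # step 605
        self.counters[index] += 1                   # step 606

    def unaccessed(self):
        """Indices of data elements whose counter is still zero."""
        return [i for i, count in enumerate(self.counters) if count == 0]


cov = DataCoverage(base=0x1000, item_size=4, item_count=4,
                   traced_functions=["lookup"])
cov.on_instruction("lookup", 0x1000)  # element 0
cov.on_instruction("lookup", 0x1009)  # a byte inside element 2
cov.on_instruction("lookup", 0x1008)  # element 2 again
cov.on_instruction("main", 0x1004)    # function not traced, ignored
cov.on_instruction("lookup", 0x2000)  # outside the region, ignored
cov.on_instruction("lookup", None)    # no memory read, ignored
print(cov.counters)      # [1, 0, 2, 0]
print(cov.unaccessed())  # [1, 3]
```

Integer division of the offset by the item size maps any address within an element to that element's counter. Counters left at zero identify the elements that were never read.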

FIG. 7 depicts a computer system 700 adapted to use the 
present invention. Central processing unit (CPU) 701 is 
coupled to bus 702. In addition, bus 702 is coupled to 
random access memory (RAM) 703, read only memory 
(ROM) 704, input/output (I/O) adapter 705, communica- 
tions adapter 711, user interface adapter 708, and display 
adapter 709. 

RAM 703 and ROM 704 hold user and system data and 
programs as is well known in the art. I/O adapter 705 
connects storage devices, such as hard drive 706 or CD 
ROM (not shown), to the computer system. Communica-
tions adapter 711 couples the computer system to a local,
wide-area, or Internet network 712. User interface adapter 
708 couples user input devices, such as keyboard 713 and 
pointing device 707, to the computer system 700. Finally, 
display adapter 709 is driven by CPU 701 to control the 
display on display device 710. CPU 701 may be any general 
purpose CPU, such as an HP PA-8200. However, the present
invention is not restricted by the architecture of CPU 701 as 
long as CPU 701 supports the inventive operations as 
described herein. 

Although the present invention and its advantages have 
been described in detail, it should be understood that various 
changes, substitutions and alterations can be made herein 
without departing from the spirit and scope of the invention 
as defined by the appended claims. 






What is claimed is: 

1. A method for analyzing data coverage of a computer 
program, the method comprising the steps of: 

identifying a region of interest in computer memory, said 
region of interest being a memory region of interest and 
having a plurality of data elements; and 

determining an extent of access to said memory region of 
interest by said computer program, thereby providing 
data coverage information, wherein the step of deter- 
mining comprises the step of identifying any data 
elements within said memory region of interest not 
accessed during execution. 

2. The method of claim 1, wherein the step of determining 
comprises the steps of: 

identifying read operations by computer program instruc- 
tions to the data elements within the memory region of 
interest; and 

maintaining track of reads from data elements during 
execution of the computer program, thereby indicating 
which data elements have and have not been read from 
during execution of the computer program. 

3. The method of claim 2, wherein the step of maintaining 
track of reads from the data elements comprises the step of: 

instrumenting a subset of computer program instructions 
to check for read operations from data elements in the 
memory region of interest for computational efficiency. 

4. The method of claim 3, wherein the step of instrument- 
ing a subset of computer program instructions further com- 
prises the step of: 

marking instructions to be instrumented with codes rec- 
ognizable to a compiler, thereby generating marked 
instructions; and 

instrumenting said marked instructions to check for read 
operations from data elements in the memory region of 
interest, thereby obviating a need to instrument instruc- 
tions which are not marked and providing computa- 
tional efficiency. 

5. The method of claim 3, wherein the step of instrument- 
ing a subset of computer program instructions comprises the 
steps of: 

identifying functions of interest within the computer 
program; and 

limiting said step of instrumenting to instructions associ- 
ated with said functions of interest within said com- 
puter program, wherein said instructions associated 
with said functions of interest are associated instruc- 
tions. 

6. The method of claim 5, further comprising the steps of:

identifying associated instructions which access data
elements within the memory region of interest; and

counting the number of times each data element in the 
memory region of interest is accessed by said associ- 
ated instructions during execution of the computer 
program. 

7. The method of claim 5, wherein the step of instrument- 
ing is performed in an instrumenting compiler. 

8. The method of claim 5, wherein the step of instrument- 
ing is performed in a static instrumentor. 

9. The method of claim 5, wherein the step of instrument- 
ing is performed in a dynamic instrumentor. 

10. The method of claim 2, further comprising the steps
of:

creating data coverage tables having coverage elements 
corresponding to the data elements in the memory 
regions of interest, said coverage elements in said data 
coverage tables being counters; 
initializing said counters to zero; and

incrementing a counter corresponding to a data element in 
a memory region of interest by a single count whenever 
said data element is accessed by the computer program. 

11. The method of claim 10, comprising the further steps
of:

establishing a final count for each said counter; 

producing a data coverage report indicating the final count 
for each said counter; and 

color coding said report based on a numerical value of the 
final count for each said counter, thereby indicating the 
relative access frequency of the data elements in the 
memory region of interest in a visually illustrative 
manner.

12. A system for analyzing data coverage of a computer 
program, the system comprising: 

a memory region specification for identifying a region of
interest in computer memory, said region of interest
being a memory region of interest and having a
plurality of data elements; and

a data coverage analyzer for determining an extent of
access to said memory region of interest, wherein the
data coverage analyzer comprises

a memory region analyzer for identifying read
operations by computer program instructions to the data
elements within the memory region of interest, and

a data access recorder for maintaining track of reads
from the data elements during execution of the
computer program, thereby indicating which data
elements have and have not been read from during
execution of the computer program.

13. The system of claim 12, wherein the data coverage 
analyzer comprises: 

a null access tracker for identifying any data elements
within said memory region of interest not accessed 
during execution. 

14. The system of claim 13, wherein the data access 
recorder comprises: 

a data access analyzer for instrumenting a subset of
computer program instructions to check for read opera- 
tions from data elements in the memory region of 
interest for computational efficiency. 

15. The system of claim 14, wherein the data access 
recorder further comprises: 
a function list specification for identifying functions of
interest within the computer program; and 

a data coverage optimizer for limiting instrumentation to 
instructions associated with said functions of interest 
within said computer program, wherein said instruc- 
tions associated with said functions of interest are 
associated instructions. 

16. The system of claim 15, further comprising: 

a coverage instruction locator for identifying associated 
instructions which access data elements within the 
memory region of interest; and 

data coverage reporters for counting the number of times 
each data element in the memory region of interest is 
accessed by said associated instructions during execu- 
tion of the computer program. 

17. The system of claim 16, wherein the data coverage 
reporters comprise: 

data coverage tables having coverage elements corre- 
sponding to elements in the memory regions of interest, 
said coverage elements in said data coverage tables 
being counters; 

data element resetters for initializing said counters to 
zero; and 

data element adjustors for incrementing a counter corre- 
sponding to a data element in a memory region of 
interest by a single count whenever said data element is 
accessed by the computer program. 

18. A computer program product having a computer 
readable medium having computer program logic recorded 
thereon for analyzing data coverage of a computer program, 
the computer program product comprising: 

a memory region specification for identifying a region of 
interest in computer memory, said region of interest 
being a memory region of interest and having a plu- 
rality of data elements; and 

a data coverage analyzer for determining an extent of 
access to said memory region of interest, wherein the 
data coverage analyzer comprises a null access tracker 
for identifying any data elements within said memory 
region of interest not accessed during execution. 